在线广告收入占发布者的收入流越来越多的份额,特别是对于依赖谷歌和Facebook等技术公司广告网络的中小型出版商而言。因此,出版商可能会从准确的在线广告收入预测中获益,以更好地管理其网站货币化战略。但是,只能获得自己的收入数据的出版商缺乏出版商广告总市场的整体视图,这反过来限制了他们在他们未来的在线广告收入中产生见解的能力。为了解决这一业务问题,我们利用了一个专有的数据库,包括来自各种各样的地区的大量出版商的Google Adsense收入。我们采用时间融合变压器(TFT)模型,这是一种新的基于关注的架构,以预测出版商的广告收入。我们利用多个协变量,不仅包括出版商自己的特征,还包括其他出版商的广告收入。我们的预测结果优于多个时间范围的几个基准深度学习时间系列预测模型。此外,我们通过分析可变重要性重量来识别显着的特征和自我注意重量来解释结果,以揭示持久的时间模式。
translated by 谷歌翻译
Neural compression offers a domain-agnostic approach to creating codecs for lossy or lossless compression via deep generative models. For sequence compression, however, most deep sequence models have costs that scale with the sequence length rather than the sequence complexity. In this work, we instead treat data sequences as observations from an underlying continuous-time process and learn how to efficiently discretize while retaining information about the full sequence. As a consequence of decoupling sequential information from its temporal discretization, our approach allows for greater compression rates and smaller computational complexity. Moreover, the continuous-time approach naturally allows us to decode at different time intervals. We empirically verify our approach on multiple domains involving compression of video and motion capture sequences, showing that our approaches can automatically achieve reductions in bit rates by learning how to discretize.
translated by 谷歌翻译
Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding that directly feeds the analog pixel values in the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.
translated by 谷歌翻译
This paper extends quantile factor analysis to a probabilistic variant that incorporates regularization and computationally efficient variational approximations. By means of synthetic and real data experiments it is established that the proposed estimator can achieve, in many cases, better accuracy than a recently proposed loss-based estimator. We contribute to the literature on measuring uncertainty by extracting new indexes of low, medium and high economic policy uncertainty, using the probabilistic quantile factor methodology. Medium and high indexes have clear contractionary effects, while the low index is benign for the economy, showing that not all manifestations of uncertainty are the same.
translated by 谷歌翻译
We introduce organism networks, which function like a single neural network but are composed of several neural particle networks; while each particle network fulfils the role of a single weight application within the organism network, it is also trained to self-replicate its own weights. As organism networks feature vastly more parameters than simpler architectures, we perform our initial experiments on an arithmetic task as well as on simplified MNIST-dataset classification as a collective. We observe that individual particle networks tend to specialise in either of the tasks and that the ones fully specialised in the secondary task may be dropped from the network without hindering the computational accuracy of the primary task. This leads to the discovery of a novel pruning-strategy for sparse neural networks
translated by 谷歌翻译
Overfitting is a problem in Convolutional Neural Networks (CNN) that causes poor generalization of models on unseen data. To remediate this problem, many new and diverse data augmentation methods (DA) have been proposed to supplement or generate more training data, and thereby increase its quality. In this work, we propose a new data augmentation algorithm: VoronoiPatches (VP). We primarily utilize non-linear recombination of information within an image, fragmenting and occluding small information patches. Unlike other DA methods, VP uses small convex polygon-shaped patches in a random layout to transport information around within an image. Sudden transitions created between patches and the original image can, optionally, be smoothed. In our experiments, VP outperformed current DA methods regarding model variance and overfitting tendencies. We demonstrate data augmentation utilizing non-linear re-combination of information within images, and non-orthogonal shapes and structures improves CNN model robustness on unseen data.
translated by 谷歌翻译
Generative models have been widely applied to solve extractive tasks, where parts of the input is extracted to form the desired output, and achieved significant success. For example, in extractive question answering (QA), generative models have constantly yielded state-of-the-art results. In this work, we identify the issue of tokenization inconsistency that is commonly neglected in training these models. This issue damages the extractive nature of these tasks after the input and output are tokenized inconsistently by the tokenizer, and thus leads to performance drop as well as hallucination. We propose a simple yet effective fix to this issue and conduct a case study on extractive QA. We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets, with a notable average of +1.7 F2 gain when a BART model is trained on SQuAD and evaluated on 8 QA datasets. Further, the model converges faster, and becomes less likely to generate out-of-context answers. With these findings, we would like to call for more attention on how tokenization should be done when solving extractive tasks and recommend applying consistent tokenization during training.
translated by 谷歌翻译
An optimal delivery of arguments is key to persuasion in any debate, both for humans and for AI systems. This requires the use of clear and fluent claims relevant to the given debate. Prior work has studied the automatic assessment of argument quality extensively. Yet, no approach actually improves the quality so far. Our work is the first step towards filling this gap. We propose the task of claim optimization: to rewrite argumentative claims to optimize their delivery. As an initial approach, we first generate a candidate set of optimized claims using a sequence-to-sequence model, such as BART, while taking into account contextual information. Our key idea is then to rerank generated candidates with respect to different quality metrics to find the best optimization. In automatic and human evaluation, we outperform different reranking baselines on an English corpus, improving 60% of all claims (worsening 16% only). Follow-up analyses reveal that, beyond copy editing, our approach often specifies claims with details, whereas it adds less evidence than humans do. Moreover, its capabilities generalize well to other domains, such as instructional texts.
translated by 谷歌翻译
We consider distributed learning in the presence of slow and unresponsive worker nodes, referred to as stragglers. In order to mitigate the effect of stragglers, gradient coding redundantly assigns partial computations to the worker such that the overall result can be recovered from only the non-straggling workers. Gradient codes are designed to tolerate a fixed number of stragglers. Since the number of stragglers in practice is random and unknown a priori, tolerating a fixed number of stragglers can yield a sub-optimal computation load and can result in higher latency. We propose a gradient coding scheme that can tolerate a flexible number of stragglers by carefully concatenating gradient codes for different straggler tolerance. By proper task scheduling and small additional signaling, our scheme adapts the computation load of the workers to the actual number of stragglers. We analyze the latency of our proposed scheme and show that it has a significantly lower latency than gradient codes.
translated by 谷歌翻译
With the rise of AI and automation, moral decisions are being put into the hands of algorithms that were formerly the preserve of humans. In autonomous driving, a variety of such decisions with ethical implications are made by algorithms for behavior and trajectory planning. Therefore, we present an ethical trajectory planning algorithm with a framework that aims at a fair distribution of risk among road users. Our implementation incorporates a combination of five essential ethical principles: minimization of the overall risk, priority for the worst-off, equal treatment of people, responsibility, and maximum acceptable risk. To the best of the authors' knowledge, this is the first ethical algorithm for trajectory planning of autonomous vehicles in line with the 20 recommendations from the EU Commission expert group and with general applicability to various traffic situations. We showcase the ethical behavior of our algorithm in selected scenarios and provide an empirical analysis of the ethical principles in 2000 scenarios. The code used in this research is available as open-source software.
translated by 谷歌翻译